-
Notifications
You must be signed in to change notification settings - Fork 920
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement DataFrame.hash_values, deprecate DataFrame.hash_columns. #9458
Implement DataFrame.hash_values, deprecate DataFrame.hash_columns. #9458
Conversation
Codecov Report
@@ Coverage Diff @@
## branch-21.12 #9458 +/- ##
================================================
- Coverage 10.79% 10.65% -0.14%
================================================
Files 116 117 +1
Lines 18869 19759 +890
================================================
+ Hits 2036 2106 +70
- Misses 16833 17653 +820
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Need to add tests though!
@skirui-source Good catch! I have now added tests. The tests for |
@gpucibot merge |
This PR removes the deprecated method `DataFrame.hash_columns`. Users can replace existing calls like `df.hash_columns(columns, method)` with `df[columns].hash_values(method)`. Resolves #9503, follows up on #9458. Authors: - Bradley Dice (https://github.com/bdice) Approvers: - https://github.com/brandon-b-miller URL: #9943
This PR implements
DataFrame.hash_values
, which will replaceDataFrame.hash_columns
(which is deprecated in this PR). This proposal was discussed offline with @vyasr and in the weekly cuDF Python dev meeting.This unifies the method name and signature for
Series.hash_values
andDataFrame.hash_values
, enabling future internal refactoring by moving the method's implementation to theFrame
class (though I'm waiting for the removal ofSeries.hash_encode
to follow up on this so it can be done in a single pass, see #9381 and #9457).